Speech-Driven Facial Reenactment Using Conditional Generative Adversarial Networks

نویسندگان

  • Seyed Ali Jalalifar
  • Hosein Hasani
  • Hamid Aghajan
چکیده

We present a novel approach to generating photo-realistic images of a face with accurate lip sync, given an audio input. By using a recurrent neural network, we achieved mouth landmarks based on audio features. We exploited the power of conditional generative adversarial networks to produce highly-realistic face conditioned on a set of landmarks. These two networks together are capable of producing sequence of natural faces in sync with an input audio track.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

High-Quality Face Image SR Using Conditional Generative Adversarial Networks

We propose a novel single face image superresolution method, which named Face Conditional Generative Adversarial Network(FCGAN), based on boundary equilibrium generative adversarial networks. Without taking any facial prior information, our method can generate a high-resolution face image from a low-resolution one. Compared with existing studies, both our training and testing phases are end-toe...

متن کامل

Conditional generative adversarial nets for convolutional face generation

We apply an extension of generative adversarial networks (GANs) [8] to a conditional setting. In the GAN framework, a “generator” network is tasked with fooling a “discriminator” network into believing that its own samples are real data. We add the capability for each network to condition on some arbitrary external data which describes the image being generated or discriminated. By varying the ...

متن کامل

Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification

Improving speech system performance in noisy environments remains a challenging task, and speech enhancement (SE) is one of the effective techniques to solve the problem. Motivated by the promising results of generative adversarial networks (GANs) in a variety of image processing tasks, we explore the potential of conditional GANs (cGANs) for SE, and in particular, we make use of the image proc...

متن کامل

Improvement of generative adversarial networks for automatic text-to-image generation

This research is related to the use of deep learning tools and image processing technology in the automatic generation of images from text. Previous researches have used one sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions that are presented in the form of sentences to produce and improve the image. The proposed ...

متن کامل

Generative Adversarial Network-Based Glottal Waveform Model for Statistical Parametric Speech Synthesis

Recent studies have shown that text-to-speech synthesis quality can be improved by using glottal vocoding. This refers to vocoders that parameterize speech into two parts, the glottal excitation and vocal tract, that occur in the human speech production apparatus. Current glottal vocoders generate the glottal excitation waveform by using deep neural networks (DNNs). However, the squared error-b...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2018